ABSTRACT

Recent experiments have provided conflicting observations
that either parallel or non-parallel G-quadruplexes
exist in molecularly crowded buffers that mimic the
cellular environment. Here, we used laser tweezers
to mechanically unfold structures in a human telomeric
DNA fragment, 5'-(TTAGGG)4TTA, along three
different trajectories. After the end-to-end distance 
of each unfolding geometry was measured, it was 
compared with PDB structures to identify the 
best-matching G-quadruplex conformation. This 
method is well-suited to identify biomolecular struc- 
tures in complex settings not amenable to conventional 
approaches, such as in a solution with mixed species or 
at physiologically significant concentrations. With this 
approach, we found that parallel G-quadruplex coexists 
with non-parallel species (1:1 ratio) in crowded buffers
with dehydrating cosolutes [40% w/v dimethyl sulfoxide 
(DMSO) or acetonitrile (ACN)]. In crowded solutions 
with steric cosolutes [40% w/v bovine serum albumin 
(BSA)], the parallel G-quadruplex constitutes only 
10% of the population. This difference unequivocally 
supports the notion that dehydration promotes the 
formation of parallel G-quadruplexes. Compared with 
DNA hairpins that have decreased unfolding forces 
in crowded (9 pN) versus diluted (15 pN) buffers, those 
of G-quadruplexes remain the same (20 pN). Such a 
result implies that in a cellular environment, DNA 
G-quadruplexes, instead of hairpins, can stop DNA/ 
RNA polymerases with stall forces often <20 pN. 

INTRODUCTION 

G-quadruplexes have drawn significant research attention 
as their in vivo existence has been demonstrated and their 
potential roles in gene regulation have been unveiled (1-4). 
During the past decade, X-ray crystallography and NMR
approaches have shown rather versatile G-quadruplex
conformations. For example, human telomeric 
G-quadruplexes have exhibited at least nine conform- 
ations (5-12). In X-ray crystallography, snapshot struc- 
tures are obtained with limited information on the 
dynamic interaction between the G-quadruplex and 
solvent molecules. NMR can probe structures in a solu- 
tion. However, the relatively high substrate concentration 
required for NMR may not be fully compatible with that 
available in vivo. In a solution with more than one species, 
it becomes rather challenging to reveal the conformation 
of each species using these methods. 

Most investigations of G-quadruplexes are
performed in diluted buffers that have limited biological
relevance, as the cellular environment is filled with ~40%
biomacromolecules such as proteins. To address this
issue, molecularly crowded conditions that mimic the cellular
environment are used (13). In these crowded buffers, NMR
revealed parallel conformation of G-quadruplexes (6). 
However, non-parallel G-quadruplexes were also 
observed (14,15). The controversial observations may be 
ascribed to different experimental conditions such as 
types of cosolutes, concentrations and the sequences of 
G-rich DNA fragments and re-annealing rates (6,16-18). 
A recent investigation by Hänsel et al. (15) cautioned against the
use of polyethylene glycol as a cosolute and suggested
that proteins or egg extracts are better cosolutes to mimic 
cellular conditions. However, the structural complexity of 
these biomacromolecular cosolutes can interfere with the 
NMR signal (14,15), which thwarts the efforts to resolve 
G-quadruplex structures under these conditions. 

The force measurement by single-molecule approaches 
such as AFM or laser tweezers naturally prevents signal 
interference from the macromolecular cosolutes. Because 
structural probing is performed one molecule at a time, it is
expected that the information can be acquired for each
population in a solution mixture. Importantly, the
superior sensitivity of single-molecule methods enables
structural probing at DNA concentrations close to
in vivo levels. As only a few DNA copies contain the
sequence of interest in the nucleus, this concentration is often
in the nanomolar range (see Supplementary Data for
detailed calculation).

The force-based single-molecule approach has the additional
and unique advantage of revealing mechanical
properties of a biomolecule, such as the unfolding force, F_unfold.
In diluted buffers, it has been discovered that the F_unfold
of G-quadruplexes (19-21) is significantly higher than the
stall force of many motor proteins, such as DNA and
RNA polymerases (22-24). In contrast, the F_unfold of
hairpins (25), which have a dsDNA stem in the structure,
is comparable with the stall forces. This suggests that,
compared with hairpins, G-quadruplexes can serve as a
stronger road block to the transcription or replication
processes. However, there is a lack of reports on the mech- 
anical stabilities of either DNA G-quadruplexes or 
hairpins under molecularly crowded conditions, which 
prevents further interpretation of the biological relevance 
of these secondary structures. 

Here, we describe a laser-tweezers-based single-molecule
approach to quickly identify each G-quadruplex conformation
and reveal its mechanical stability in a molecularly
crowded buffer at physiologically relevant concentrations
of a human telomere sequence, 5'-(TTAGGG)4TTA.
Assisted by a statistical deconvolution method [integrated
Population Deconvolution at Nanometer resolution, or
iPoDNano (26)], this approach identifies G-quadruplex
species by determining the handle-to-handle distances
among three nucleotides arranged in a triangle. To discriminate
between the dehydration and the excluded-volume effects on the
formation of specific G-quadruplex structures, we
analyzed the population distribution in different crowded
buffers. In dehydrating buffers that contain either 40%
(w/v) DMSO or 40% (w/v) ACN as cosolutes, the parallel
G-quadruplex coexists with the hybrid 1 or the basket G-quadruplex,
respectively, at a ratio of 1:1. In buffers that mainly
exert steric effects [40% w/v bovine serum albumin
(BSA)], we identified a parallel (10%) and a hybrid 1
G-quadruplex (90%). Previously, the large quantity of BSA
in this buffer deteriorated NMR signals, so that
G-quadruplex structures could not be determined explicitly. Our experiments
also reveal for the first time that the mechanical stabilities of
human telomeric G-quadruplexes remain the same,
whereas those of a DNA hairpin significantly decrease, in
crowded buffers with respect to diluted solutions.

MATERIALS AND METHODS 

DNA oligonucleotides used in this study were purchased 
from Integrated DNA Technologies (IDT, Coralville, IA). 
Enzymes were purchased from New England Biolabs 
(NEB, Ipswich, MA), and all the chemicals (>99% in 
purity) were purchased from VWR (West Chester, PA). 
The surface functionalized beads for the single-molecule 
experiments were obtained from Spherotech (Lake 
Forest, IL). 

DNA constructs 

All the DNA sequences (except the dsDNA handles),
including azide or terminal alkyne groups at specific
nucleotides, are shown in Supplementary Table S1. The
[5'-3'] (U1) DNA construct for the mechanical unfolding
of human telomeric G-quadruplex in the sequence
5'-TTA G1 G2 G3 T4 T5 A6 G7 G8 G9 T10 T11 A12 G13 G14 G15
T16 T17 A18 G19 G20 G21 TTA-3' was sandwiched between
two double-stranded DNA (dsDNA) handles using procedures
described previously (19). Briefly, one of the
dsDNA handles 2690 bp in length was prepared from 
the pEGFP vector (Clontech, Mountain View, CA) by 
digestion with SacI and EagI restriction enzymes. The 
SacI end was functionalized with digoxigenin (Dig- 
dUTP, Roche) through terminal deoxynucleotidyl trans- 
ferase (Fermentas). The other dsDNA handle of 2028 bp 
in length was obtained from polymerase chain reaction of 
a pBR322 template using a biotinylated primer and a 
primer containing either an XbaI restriction site or a
click chemistry functional group (Supplementary Table
S1). Literature procedures were followed to prepare the
[5'-L2] (U2) and [3'-L2] (U3) constructs (27). Briefly, an 
EagI overhang was introduced at the 5' dsDNA terminal 
(for U2) or the 3' dsDNA terminal (for U3) in a dsDNA- 
single-stranded DNA (ssDNA) hybrid that contains the 
hTel4G sequence. The respective dsDNA termini were 
then ligated to the 2690 bp dsDNA handle using T4 
DNA ligase. The T11 at the central loop (L2) of the
hTel4G fragment was replaced with an azide-modified
uridine for the U2 and U3 constructs (Supplementary Table
S1). The 2028 bp dsDNA handle was then attached to the
azide group through the copper-catalyzed azide-alkyne 
cycloaddition reaction (28). The resultant DNA was 
purified by ethanol precipitation. This DNA construct 
contains a single-stranded hTel4G sequence linked to 
one dsDNA handle through a phosphodiester bond and 
another handle through the click chemistry connection. By 
replacing the hTel4G sequence with the sequence
5'-GC(T)19GCTTTTGC(A)19GC, a construct that
contains a DNA hairpin was similarly prepared.

Single-molecule experiments 

The mechanical unfolding of G-quadruplex in hTel4G was
performed in a home-built dual-trap 1064 nm laser-tweezers
instrument (29) at room temperature under different buffer
conditions (10 mM Tris buffer that contains 100 mM of
either NaCl or KCl and, for crowded buffers, 40% w/v DMSO, ACN
or BSA at pH 7.4). Before the unfolding, the DNA construct
was immobilized onto a 2.10 µm bead through the
digoxigenin-anti-digoxigenin antibody complex. The
DNA-immobilized bead and a streptavidin-coated bead
were trapped separately by two laser foci. Next, we
brought one of the laser traps close to the other through
a motorized steerable mirror to tether the DNA construct
between the beads. The tethered DNA was extended to a force
below the denaturation plateau force (see later; we set
the maximum force at <60 pN in diluted buffers and
~50 pN in DMSO-, ACN- or BSA-containing buffers),
and relaxed to 0 pN in ~10 s by moving one of
the trapped beads with a loading rate of 5.5 pN/s.
The force-extension (F-X) curves were recorded at
1000 Hz using a LabVIEW program. The raw data were
filtered with a Savitzky-Golay function using a Matlab
program (The MathWorks, Natick, MA), with a time
constant of 10 ms for buffers without cosolutes and
50 ms for buffers with cosolutes.

After each pair of beads was trapped, the trap stiffness of
each laser focus was calculated from the thermal motion
of the trapped bead using a reported method (29). The
trap stiffness information in diluted and molecularly
crowded buffers is summarized in Supplementary Table
S2. The accuracy of the instrument in force measurement
was confirmed by the observation of the 65 pN
plateau that indicates the denaturation of single dsDNA
molecules in a 10 mM Tris buffer (pH 7.4) with 100 mM
KCl (30). The accuracy of the instrument in distance
measurement was confirmed by the determination of the
contour length per DNA nucleotide using the method
described in the literature (25). The value, 0.44 ± 0.02 nm
per nucleotide, is identical to the literature value (25) and
within the range measured by other approaches (31-34).

Plot of change-in-contour-length versus force 

After each F-X curve was split into the stretching (red) and
relaxing (black) traces (Figure 1B), the change in extension
(Δx) was calculated by subtracting the former trace
from the latter at a particular force (F). The resulting Δx
at the particular force was then converted to the
change-in-contour-length (ΔL) using the following
equation based on the Worm-Like-Chain (WLC) model (35),
which applies when the contour length of the secondary structure
is much less than that of the dsDNA handles (<5%)
(27,35):



$$\frac{\Delta x}{\Delta L} = 1 - \frac{1}{2}\left(\frac{k_B T}{F P}\right)^{1/2} + \frac{F}{S} \qquad (1)$$

where ΔL reflects the change in the apparent contour
length at F = 0 pN between the two pulling handles
before and after a structure is unfolded, k_B is the
Boltzmann constant, T is the absolute temperature, F is the
force, P is the persistence length and S is the elastic stretch
modulus. To determine P and S under different buffer
conditions, we first fit the F-X curves using a sequential WLC
model that accounts for both the dsDNA handles and the
secondary structure tethered in between (36) (see
Supplementary Figure S1 for F-X curves and fittings).
The fitted parameters (Supplementary
Table S2) were then used to obtain the ΔL values in
a specific buffer. The ΔL values from this approach are
identical, within experimental error, to those obtained
from the sequential WLC fitting described previously
(Supplementary Table S2).
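The conversion from Δx to ΔL can be illustrated with a short Python sketch. It assumes that the conversion takes the standard extensible worm-like-chain form, Δx/ΔL = 1 - (1/2)(k_B T/(F P))^(1/2) + F/S. The function names, the thermal energy of 4.11 pN nm (about 298 K) and the example values of P and S are illustrative assumptions, not the fitted parameters of Supplementary Table S2.

```python
import math

# Assumed thermal energy at ~298 K, in pN*nm (not stated in the text).
KBT_PN_NM = 4.11


def wlc_fractional_extension(force_pn, persistence_nm, stretch_modulus_pn,
                             kbt_pn_nm=KBT_PN_NM):
    """Extension per unit contour length of an extensible worm-like chain."""
    if force_pn <= 0 or persistence_nm <= 0 or stretch_modulus_pn <= 0:
        raise ValueError("force, persistence length and stretch modulus must be positive")
    return (1.0
            - 0.5 * math.sqrt(kbt_pn_nm / (force_pn * persistence_nm))
            + force_pn / stretch_modulus_pn)


def change_in_contour_length(delta_x_nm, force_pn, persistence_nm,
                             stretch_modulus_pn, kbt_pn_nm=KBT_PN_NM):
    """Convert a change in extension (nm) measured at one force into dL (nm)."""
    return delta_x_nm / wlc_fractional_extension(
        force_pn, persistence_nm, stretch_modulus_pn, kbt_pn_nm)
```

With the hypothetical values P = 50 nm and S = 1200 pN, a Δx of 7.9 nm measured at 20 pN corresponds to a ΔL slightly above 8.0 nm, because the chain is not fully extended at that force.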

Kernel density treatment and bootstrap analysis 

The kernel density treatment and bootstrap analysis on
ΔL populations were performed as described in the literature
(26,37). Briefly, for kernel density treatment, the
probability density of each transition between a folded
structure and an unfolded ssDNA was estimated according
to a Gaussian kernel (38). A kernel density histogram
was obtained by adding the probability density for each
transition (see Figure 1D for example). From each kernel
density plot, the highest two peaks were identified by the Igor
(WaveMetrics, Portland, OR) program. A total of 3000
random re-samplings were performed to construct a bootstrapping
histogram of selected peaks (see Figure 2A for
example). When more than one population was observed
in kernel density histograms, each population was fitted
with a Gaussian to estimate the population from the area
under the curve. The probability of each population in the
bootstrapping histogram was normalized to that in the
kernel density histogram.
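The kernel density and bootstrap steps can be sketched in plain Python as follows. This is a simplified illustration, not the published iPoDNano implementation: the bandwidth, the evaluation grid and the simple local-maximum peak finder are assumptions, and only the number of re-samplings (3000) comes from the text.

```python
import math
import random


def gaussian_kde_curve(values, bandwidth, grid):
    """Sum one Gaussian kernel per measured transition, evaluated on the grid."""
    norm = 1.0 / (len(values) * bandwidth * math.sqrt(2.0 * math.pi))
    return [norm * sum(math.exp(-0.5 * ((g - v) / bandwidth) ** 2) for v in values)
            for g in grid]


def top_peaks(grid, density, n_peaks=2):
    """Return the grid positions of the n highest local maxima, sorted by position."""
    peaks = [(density[i], grid[i]) for i in range(1, len(grid) - 1)
             if density[i] > density[i - 1] and density[i] >= density[i + 1]]
    peaks.sort(reverse=True)
    return sorted(pos for _, pos in peaks[:n_peaks])


def bootstrap_peaks(values, bandwidth, grid, n_resamples=3000, n_peaks=2, seed=0):
    """Resample the data with replacement and collect the peak positions found."""
    rng = random.Random(seed)
    found = []
    for _ in range(n_resamples):
        sample = [rng.choice(values) for _ in values]
        density = gaussian_kde_curve(sample, bandwidth, grid)
        found.extend(top_peaks(grid, density, n_peaks))
    return found
```

A histogram of the positions returned by `bootstrap_peaks` plays the role of the bootstrapping histogram; its spread indicates how reproducibly each ΔL peak is located.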

Root mean square deviation calculation 

The G-quadruplex formed in the hTel4G was identified
by comparing the handle-to-handle distances (x)
measured in the laser-tweezers experiment with those
obtained from the NMR or X-ray G-quadruplex structures
(Supplementary Table S3). Using the ΔL obtained
from the laser-tweezers measurements, we first calculated
the x value according to the expression
x = L - ΔL = n × L_nt - ΔL (equation 2) (27,39), where
L and L_nt are the contour lengths for a folded structure
and a single nucleotide, respectively, and n is the number of
nucleotides in the folded structure. All x values were
evaluated against corresponding distances measured
from NMR or X-ray structures using the root mean
square deviation (RMSD). We performed two rounds of
RMSD calculations. In the first round, we calculated the
RMSD using an average L_nt (0.44 nm) obtained from all
available G-quadruplex conformations based on the ΔL
obtained from the 5'-3' geometry in our laser-tweezers
measurement. In the second round, we used the L_nt of
the conformation that showed the least RMSD in the
first round to recalculate the RMSD. These RMSD comparisons
are presented in Figures 2, 3 and 4. The visual
comparison of the x values from laser-tweezers measurements
and those from NMR or X-ray structures is
shown in Supplementary Figure S2.
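A minimal Python sketch of the distance comparison is given below. It applies equation 2 (x = n × L_nt - ΔL) and ranks candidate structures by RMSD over the measured geometries. The candidate names and their distances in the usage note are hypothetical placeholders, not values from Supplementary Table S3, and only a single round with a fixed L_nt is shown.

```python
import math


def handle_distance(n_nucleotides, delta_l_nm, l_nt_nm=0.44):
    """Equation 2: handle-to-handle distance x = n * L_nt - dL (all in nm)."""
    return n_nucleotides * l_nt_nm - delta_l_nm


def rmsd(measured, reference):
    """Root mean square deviation between two equal-length distance lists."""
    if len(measured) != len(reference) or not measured:
        raise ValueError("distance lists must be non-empty and of equal length")
    return math.sqrt(sum((m - r) ** 2 for m, r in zip(measured, reference))
                     / len(measured))


def best_match(measured, candidates):
    """Return (name of least-RMSD candidate, dict of all RMSD scores)."""
    scores = {name: rmsd(measured, ref) for name, ref in candidates.items()}
    return min(scores, key=scores.get), scores
```

For example, with hypothetical measured distances `[1.24, 2.1, 1.8]` and candidates `{"model_A": [1.2, 2.0, 1.9], "model_B": [2.5, 0.5, 3.0]}`, `best_match` selects `model_A`. The second round described in the text would repeat the same call after recomputing the measured distances with the L_nt of the first-round winner.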

Deconvolution of populations 

For the deconvolution of two populations in a ΔL histogram,
the overall population was fit with a two-peak
Gaussian function. Based on the ratio determined by
this fitting, the data under the intersection region were
randomly assigned to one of the populations. After this
random assignment, we plotted the unfolding force histograms
in Supplementary Figures S3, S4 and S5.
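One way to realize the random assignment is sketched below in Python. It is an interpretation under stated assumptions: the two Gaussian components (amplitude, mean, sigma) are taken as already fitted elsewhere, and each ΔL value is assigned to a population with a probability proportional to the height of that population's Gaussian at the value. Values far from the overlap region are therefore assigned almost deterministically, whereas values in the intersection region are split according to the local ratio.

```python
import math
import random


def gaussian(x, amplitude, mean, sigma):
    """Height of one fitted Gaussian component at x."""
    return amplitude * math.exp(-0.5 * ((x - mean) / sigma) ** 2)


def assign_populations(values, comp_a, comp_b, seed=0):
    """Label each value 'a' or 'b' at random, weighted by the two fitted Gaussians.

    comp_a and comp_b are (amplitude, mean, sigma) tuples from a two-peak fit.
    """
    rng = random.Random(seed)
    labels = []
    for v in values:
        weight_a = gaussian(v, *comp_a)
        weight_b = gaussian(v, *comp_b)
        total = weight_a + weight_b
        # If both weights underflow to zero, fall back to an even split.
        p_a = weight_a / total if total > 0 else 0.5
        labels.append("a" if rng.random() < p_a else "b")
    return labels
```

The unfolding forces belonging to each label can then be collected into separate histograms, one per population.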

Calculation of change in free energy of unfolding 

The change in free energy of unfolding (ΔG_unfold) for the
DNA hairpin was calculated from the Jarzynski
non-equilibrium theorem (40) (see equation 3). The
ΔG_unfold for each G-quadruplex population was calculated
using the same equation after the population
deconvolution described above.



$$\Delta G_{\mathrm{unfold}} = -k_B T \ln \sum_{i=1}^{N} \frac{1}{N} \exp\left(-\frac{W_i}{k_B T}\right) \qquad (3)$$



where N is the number of observations, and W is the
non-equilibrium work done to unfold the G-quadruplex
structure(s), which is equivalent to the hysteresis area
between the stretching and relaxing F-X curves. The bias
of the ΔG_unfold was calculated from the unfolding work
histograms, as described in the literature (41). Due to the
presence of the G-triplexes, which may display the same
ΔL as that of the G-quadruplexes in the U2 or U3 trajectory
(Supplementary Figure S6), the observed W contains
contributions from G-triplexes. To estimate ΔG_unfold in
these cases, we used the U1 unfolding in the same buffer
as a reference. As U1 provides distinct G-triplex and
G-quadruplex populations (Figures 2A, 3A and 4A), we
obtained ΔG_unfold for each species using equation 3. Next,
assuming inseparable G-triplexes and G-quadruplexes, we
combined the unfolding work for both species and
calculated an apparent ΔG_unfold for the U1 geometry.
The difference between this value and the ΔG_unfold for
the G-quadruplexes serves as the correction factor to
retrieve the ΔG_unfold for the fully folded G-quadruplexes
in the U2 or U3 geometry (Table 1).

Figure 1. Mechanical unfolding of human telomeric G-quadruplex in
100 mM Na+ buffer (10 mM Tris, pH 7.4). (A) Schematic of the
laser-tweezers setup and the triangulation structural probing approach.
Three unfolding geometries, 5'-3' (U1), 5'-L2 (U2) and 3'-L2 (U3),
are highlighted in a dark-blue triangle. (B) A typical force-extension
(F-X) curve containing unfolding (red) and refolding (black) of the U1
trajectory. Inset shows the unfolding event. (C) Plot of the change in
contour length (ΔL) versus force (9-27 pN) based on the F-X curve in
B. Histogram shown to the right depicts folded (ΔL ~8.0 nm) and
unfolded (ΔL = 0 nm) populations. (D) Kernel density plot for the
U1 geometry. The black dotted curve represents a two-peak Gaussian
fit, whereas the black and green curves are Gaussian fits for the minor
and major populations, respectively.

Figure 2. G-quadruplex structural identification in a 100 mM Na+
buffer (10 mM Tris, pH 7.4) using bootstrap and RMSD analyses.
(A) Bootstrapping ΔL analysis in the geometries of U1 (top), U2
(middle) and U3 (bottom). The partially and the fully folded species
in U1 are indicated with black and green colors, respectively. Each
histogram is fitted with a Gaussian. N depicts the number of experiments.
(B) RMSD comparison between the handle-to-handle distance
from the laser-tweezers measurements and those from PDB structures.
The least RMSD value is depicted by a green arrow. (C) The best-matching
G-quadruplex (PDB: 143D) with a highlighted triangle.
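The free-energy estimate used in this section can be sketched in Python, assuming the usual form of the Jarzynski equality, ΔG_unfold = -k_B T ln[(1/N) Σ exp(-W_i/(k_B T))]. The function name and the log-sum-exp rearrangement are illustrative choices; the work values and k_B T must be supplied in the same energy units (for example pN nm, with k_B T of about 4.11 pN nm near 298 K, an assumed value).

```python
import math


def jarzynski_free_energy(work_values, kbt):
    """Estimate dG from non-equilibrium work values via the Jarzynski equality.

    The exponential average is evaluated with a log-sum-exp shift so that
    large work values do not underflow.
    """
    if not work_values:
        raise ValueError("at least one work value is required")
    scaled = [-w / kbt for w in work_values]
    shift = max(scaled)
    log_mean = shift + math.log(
        sum(math.exp(s - shift) for s in scaled) / len(scaled))
    return -kbt * log_mean
```

Because the exponential average is dominated by the smallest work values, the estimate always lies at or below the mean work, which is one reason a bias correction from the work histograms is applied to the estimate.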



RESULTS AND DISCUSSION 

The triangulation structural identification method 

By attaching an ssDNA fragment, 5'-(TTAGGG)4TTA-3'
(or hTel4G), between two optically trapped beads
(Figure 1A), we can unfold the structure formed in the
sequence from the 5' to 3' direction by moving the two
beads apart (19,42). The change-in-contour-length (ΔL) due
to the structural unfolding can be retrieved by comparing
the F-X curves before and after the unfolding transition
(Figure 1B). We obtain the distance (x) between the two
dsDNA handles that hold the structure using equation 2
(see Materials and Methods for details). This
handle-to-handle distance can be compared with that measured
from NMR or X-ray crystallography to determine the
best-matching conformation. Previously, we inferred
G-quadruplex structures in the insulin-linked polymorphic
region on the basis of this measurement in a
one-dimensional space (19). By attaching dsDNA handles
through different residues in a biomolecule, in principle,
multidimensional distance measurements can be obtained
to reconstruct a complete topology of the structure formed
in the biomolecule. Nevertheless, such complete structural
profiling is time-consuming and redundant if the task
is only to identify the best-matching conformation (27,39).
To balance measurement time against the accuracy of
structural information, we determined three inter-handle
distances arranged in a triangle (Figure 1A) to identify the
G-quadruplex conformation in the hTel4G.

To better represent the entire structure, we chose three
well-separated nucleotides to which handles are
attached: two terminal guanines and the middle thymine
in the central loop (L2) (Figure 1A and Supplementary
Figure S6). Whereas the two terminal handles were
introduced through phosphodiester bonds by enzymatic
ligations (27), the middle thymidine in the central loop
was substituted with an azide-modified uridine to allow
the click chemistry attachment (see Materials and
Methods) (27). By grabbing two handles among the
three nucleotides, a total of three different unfolding
geometries can be obtained. We designate the 5' to 3'
geometry as the U1, 5' to L2 as the U2 and 3' to L2 as
the U3 trajectory.

The reliability of the method is dependent on the
accurate distance or ΔL measurement. The accuracy of
our laser-tweezers measurement was confirmed by unfolding
a DNA hairpin using a reported procedure (25), which
revealed an identical contour length per nucleotide
(see Materials and Methods for details). The capability of
this method to differentiate different populations is
dependent on the resolution in the ΔL measurements,
which can be improved by repetitive sampling from many
DNA tethers. This practice allows constructing ΔL histograms,
from which the most likely ΔL can be obtained at
sub-nanometer resolution (Figure 1C, green curve).

Figure 3. G-quadruplex structural analysis in crowded solutions (pH 7.4, 100 mM K+) with small-molecule cosolutes (40% w/v). (A) Population (top
three panels) and RMSD (bottom panel) analyses of folded species in 40% DMSO. Blue bars represent the RMSD for the major species
(ΔL = 8.5 nm) in U1 and the ~2 nm populations in U2 and U3. Red bars represent the major species (ΔL = 8.5 nm) in U1 and the ~3.8 nm
populations in U2 and U3. N depicts the number of experiments. (B) The best-matching G-quadruplex structures labeled with triangulation.
(C) Population (top three panels) and RMSD (bottom panel) analyses of folded species in the same buffer with 40% (w/v) ACN. The y-axes of
the top three panels are the same as (A). N depicts the number of experiments. Blue bars represent the RMSD for the major species (ΔL = 7.9 nm) in
U1 and the ~2 nm populations in U2 and U3. Green bars represent the major species (ΔL = 7.9 nm) in U1 and the ~4.0 nm populations in U2 and
U3. The least RMSD values are indicated by arrows in (A) and (C).

Figure 4. G-quadruplex conformational analysis in crowded solutions (pH 7.4, 100 mM K+) with biomacromolecular cosolutes (40% w/v BSA).
(A) Population analysis for the U1 (top), U2 (middle) and U3 (bottom) unfolding trajectories. N depicts the number of experiments. (B) RMSD
analysis for the best-matching G-quadruplex structures (indicated by arrows). Blue bars represent the RMSD for the major species (ΔL = 8.5 nm) in
U1 and the ~2 nm populations in U2 and U3. Red bars represent the major species in U1 and the ~3.8 nm populations in U2 and U3. (C and D)
Structures of the best-matching G-quadruplexes.

Table 1. Unfolding force (F_unfold, in pN), change in free energy of unfolding (ΔG_unfold, in kcal/mol) and % population of G-quadruplexes under
different conditions

We tested this approach in a 10 mM Tris buffer with
100 mM Na+ (pH 7.4), which is known to populate a
basket-type G-quadruplex in the hTel4G (10). The ΔL
histogram displays two populations (Figure 1C). To
identify each population, we performed an integrated
population deconvolution with nanometer or sub-nanometer
resolution (or iPoDNano) (26,37), in which a
kernel density treatment reveals the probability distribution
of all measured ΔL values (Figure 1D), and a bootstrap
analysis depicts the most probable ΔL populations
in the overall ΔL data set (Figure 2A). This deconvolution
clearly shows two populations with ΔL of ~5.1 nm
(Figure 2A, top panel, black) and ~8.0 nm (green).
Previously, the former species was assigned as a partially
folded G-triplex and the latter as a fully folded basket
G-quadruplex (21). However, only one species with ΔL
of 4.1 or 4.3 nm was observed for the U2 or the U3 unfolding,
respectively (Figure 2A). Calculations showed
that along these two trajectories, the expected ΔL for the
G-triplex unfolding is either identical to that for the
G-quadruplex, or smaller than 1 nm, which is well below
the ΔL values observed previously (see Supplementary
Figure S6). Taking into account that the G-triplex
constitutes only 25% of the overall population (Figure 2A), the
majority of the ΔL populations in the U2 or the U3
trajectory were assigned to the G-quadruplexes. Even in
the case that the G-triplex has a population comparable with
the G-quadruplex (see later), the ΔL populations in U2
and U3 can still be reliably assigned to G-quadruplexes
due to the identical ΔL values between G-triplex and
G-quadruplex species (Supplementary Figure S6).

The handle-to-handle distances calculated from these
two ΔL values and that from the 8.0 nm population in
the U1 (equation 2) were compared with those of the
NMR or X-ray structures (see Materials and Methods)
(5-12). Although many PDB structures do not have the
same flanking sequence as the hTel4G, they all share the
core G-quadruplex forming sequence, (GGGTTA)3GGG,
which provides a solid justification for our comparison
method. The RMSD between these two data sets
(Figure 2B) reveals that the best-matching structure is indeed
the basket G-quadruplex (PDB: 143D, Figure 2C), which
has been shown to populate in the core (GGGTTA)3GGG
sequence in the same buffer by NMR studies (10). Such a
result strongly suggests that dsDNA handles introduced by
enzymatic ligations or click chemistry reactions do not
significantly interfere with the folding of G-quadruplexes.
A similar approach with six inter-handle distances confirmed
the expected formation of a hybrid 1 G-quadruplex in the
hTel4G in a K+ buffer (27). These findings validated our
structural identification method for DNA structures.

Dehydration effect on the G-quadruplex conformation 

Next, we set out to probe the G-quadruplex structures in
crowded solutions with 40% (w/v) small-molecule
cosolutes, such as DMSO or ACN, which are known to
contribute to the molecular crowding mainly through the
dehydration effect (6,14). Similar to the U1 geometry in the
dilute solution (Figure 2A), both partially (5.0 nm) and
fully folded (8.5 nm) populations were observed in 40%
DMSO solution (pH 7.4) with 100 mM K+ (Figure 3A,
top panel). To our surprise, two ΔL species with a
similar population ratio were revealed in either the U2 or the U3
trajectory (48%:52% for the 2.0:3.9 nm populations in
U2 and 44%:56% for the 2.1:3.8 nm species in U3, see
Table 1, Figure 3A and Supplementary Figure S3). As
G-triplexes are population minorities (Figure 3A), the
two species are most likely different G-quadruplexes.
The similar population ratio suggests that the 2.0 nm
population in U2 is the same G-quadruplex as that of
the 2.1 nm population in U3, whereas the 3.9 nm
G-quadruplex in U2 is identical to that of the 3.8 nm
species in U3. This assignment is confirmed by comparing
the difference in the handle-to-handle distance between
the U2 and U3 trajectories measured from NMR/X-ray
structures (Supplementary Table S3, column 5 in top
panel) with that determined by mechanical unfolding
experiments (Supplementary Table S3, bottom panel).
Compared with the assignment of a ~2 nm species and a
~4 nm species to the same population, assignment of the
two ~2 nm species as one population and that of the two
~4 nm species as another (Supplementary Table S3,
bottom panel) both produce differences in the
handle-to-handle distance much closer to those expected
from the NMR/X-ray measurement. One exception is
the PDB 2KF8 species, which is a rather unusual
two-stack G-quadruplex (11). With this assignment,
the RMSD was calculated (Figure 3A, bottom panel),
which indicates the ~2.0 nm species as the parallel
G-quadruplex (PDB: 1KF1 and 2LD8 in Figure 3B
and Supplementary Figure S2) and the ~3.8 nm population
as the hybrid 1 G-quadruplex (PDB code: 2HY9,
Figure 3B and Supplementary Figure S2). The formation
of G-quadruplexes was confirmed by circular dichroism
(CD) in the same solution (Supplementary Figure S7),
which shows ~265 and ~295 nm peaks indicative of
G-quadruplex conformations (43).

Similarly, two populations were also observed in 40% (w/v)
ACN solutions (Figure 3C). Using the same strategy
described above (Supplementary Table S3), we assigned
the 1.9 nm species in U2 and the 2.1 nm species in U3 as one
population, and the 4.0 nm species in U2 and the 3.9 nm
species in U3 as another. Subsequent RMSD treatments
allowed us to determine the best-matching structures as
the parallel and the basket G-quadruplexes (Figure 3C
and Supplementary Figure S4). It is noteworthy that
the smallest RMSD values used for the structure identification
in DMSO or ACN were not bigger than those
calculated from the assignment of a ~2 nm species and a
~4 nm species as the same population (see above), thereby
corroborating the current assignment. Previously, either
parallel or non-parallel G-quadruplex was observed in
this solution (6,14). Our observations that both types of
G-quadruplexes exist in the same crowded solution (Table
1) at 23°C contrast with previous findings obtained at higher
DNA concentrations with different flanking sequences
(6,14,44,45). As the re-annealing time in our mechanical
unfolding experiments is much shorter than in CD or NMR
measurements (<1 min versus tens of minutes), we reasoned
that concentration, re-annealing time and flanking
sequences of DNA may serve as important factors for the
formation of different G-quadruplexes.



Steric effect on the G-quadruplex conformation 

DMSO and ACN contribute to the molecular crowding 
effects mainly through water depletion (6,14). Inside the 
cell, the excluded-volume effect from biological macro- 
molecules, such as proteins, contributes to the crowding 
effects (46). However, it is difficult to interpret 
G-quadruplex conformations amidst background signals 
from complex protein structures in conventional methods 
such as NMR (14,15) and CD. The method described here
relies on the distance measurements from the DNA
molecules specifically tethered between two trapped
beads (Figure 1A), which prevents interference from
cosolutes. To demonstrate this, we used 40% (w/v) BSA
as the biomacromolecular cosolute. Similar to the
small-molecule cosolutes, parallel G-quadruplex was
observed (the ~2 nm species in Figure 4A, middle and
bottom panels). However, this conformation constitutes
only 10% of the overall population (see Supplementary
Figure S5 for population deconvolution), whereas the
majority is the hybrid 1 G-quadruplex (the ~4 nm population
in Figure 4 and Table 1).

The excluded-volume effect is expected to favor 
a conformation of smaller size. Because the 
parallel G-quadruplex is not the most compact 
conformation among telomeric G-quadruplexes with known 
structures (14), the steric effect of the crowding 
environment was ruled out as the driving force for parallel 
G-quadruplex formation. Instead, dehydration was 
proposed as the predominant factor, which was supported 
by the identification of parallel G-quadruplex by NMR in 
buffers with dehydrating cosolutes such as DMSO or 
ACN (6). A critical test of this hypothesis is to show 
that non-parallel G-quadruplexes are the major species in a 
crowded solution that mainly exerts an excluded-volume 
effect, such as a 40% BSA buffer. Owing to 
signal broadening in NMR, however, the limited success 
achieved so far only suggests that non-parallel 
G-quadruplexes exist in these solutions (14,15); the exact 
conformations are yet to be determined. The coexistence 
of a hybrid 1 and a parallel G-quadruplex at a 
9:1 ratio observed here represents the first definite 
conformation identification in these solutions. Compared 
with the ~50% parallel G-quadruplex population 
in crowded buffers that contain 
dehydrating cosolutes, the 10% parallel conformation in 
the BSA buffer provides direct experimental evidence that 
dehydration, rather than the excluded-volume effect, is the 
driving force that induces parallel G-quadruplexes. 

Mechanical and thermodynamic stabilities of the 
G-quadruplex species 

In addition to structural interpretation, force-based 
single-molecule approaches can provide unique information 
on the mechanical stability of a folded structure and the 
ΔG_unfold (20,47). Whereas the mechanical stability of 
dsDNA in a sliding-like geometry (5'-3') decreases 
significantly from 65 to 52 pN, the force at which dsDNA denatures 
(30), the rupture force (F_unfold) of G-quadruplexes remains 
the same (~20 pN) in a crowded buffer compared with a 
diluted solution (Table 1 and Supplementary Figures S3, S4 
and S5). When a DNA hairpin is unfolded along the same 
geometry as that of the G-quadruplex, the rupture force also 
shows a substantial decrease in a crowded solution (9 pN) 
compared with that in a diluted buffer (15 pN) 
(Supplementary Figure S8). It is noteworthy that other 
mechanical properties, such as the persistence length (P) and 
elastic stretch modulus (S) of dsDNA handles and 
unfolded ssDNA, do not vary significantly between different 
buffers (Supplementary Table S2). These results indicate 
that in crowded buffers, the F_unfold of G-quadruplexes is 
higher than the stall force of DNA or RNA polymerases 
(<20 pN), whereas that of hairpins is within the reach of the 
load force of these polymerases. It is expected, therefore, 
that G-quadruplexes can serve as an effective mechanical 
impediment to DNA replication or RNA transcription. 
In contrast, DNA hairpins are likely to be 
disassembled during these cellular activities. 

To estimate the ΔG_unfold, we used the Jarzynski 
non-equilibrium theorem (see equation 3 in Materials 
and Methods). As shown in Table 1, different unfolding 
geometries of the same species provide similar ΔG_unfold, 
confirming that free energy is a state function independent 
of reaction pathways (27). The population ratio of the 
basket versus parallel G-quadruplex in the ACN buffer 
and that of the hybrid 1 versus parallel G-quadruplex 
in the DMSO solution closely reflect the thermodynamic 
equilibrium determined by the corresponding ΔG_unfold 
values (Table 1). As the radii of gyration for basket and 
hybrid 1 G-quadruplexes are among the smallest 
for G-quadruplexes with known structures (14), the 
excluded-volume effect may explain their prevalence in 
crowded buffers. Comparison of ΔG_unfold reveals that 
the basket G-quadruplex displays a 23% increase in 
stability, whereas the hybrid 1 G-quadruplex shows a 17% 
decrease, in crowded buffers compared with diluted 
solutions (Table 1). The increase in the stability of the basket 
G-quadruplex is expected, as Hoogsteen base pairs are 
stabilized by molecular crowding effects (48). The 17% 
decrease in the stability of the hybrid 1 G-quadruplex is 
surprising. We note that deconvolution of the G-quadruplex 
from the inseparable G-triplexes may introduce an 
uncertainty in the ΔG_unfold estimation (see 'Materials and 
Methods' section). It is also possible that cosolutes 
specifically recognize different G-quadruplex 
species (15). On the other hand, the 18% decrease in the 
thermodynamic stability of a DNA hairpin (7.0 → 
5.3 kcal/mol, from the Jarzynski calculation), which 
contains a long dsDNA stem (see Materials and Methods 
for the sequence), fully agrees with the finding that 
Watson-Crick base pairing has a 14.8% decrease in 
stability in crowded buffers (48). It is also consistent with the 
decrease in rupture force observed above. 
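
As an illustration of the free-energy estimate discussed above, the following minimal Python sketch applies the Jarzynski equality in its standard form, ΔG = -kT ln⟨exp(-W/kT)⟩, to a set of unfolding work values. It is not the authors' analysis code: the function names, the choice of kcal/mol units and the use of kT at 23°C are assumptions made for this example, and the second function assumes a simple two-species Boltzmann equilibrium in which both folded species share the same unfolded state. 

```python
import math

# k_B*T in kcal/mol at 23 °C (296.15 K); gas constant R = 1.9872041e-3 kcal/(mol K).
# The temperature matches the 23 °C quoted in the text; the unit choice is an assumption.
KT_KCAL_PER_MOL = 1.9872041e-3 * 296.15


def jarzynski_free_energy(works, kT=KT_KCAL_PER_MOL):
    """Estimate the unfolding free energy from non-equilibrium work values.

    Implements dG = -kT * ln( (1/N) * sum_i exp(-W_i / kT) ).
    `works` and the result are in the same energy units as kT.
    """
    if not works:
        raise ValueError("at least one work value is required")
    # Shift by the smallest work value so the exponentials cannot underflow to zero.
    w_min = min(works)
    shifted_sum = sum(math.exp(-(w - w_min) / kT) for w in works)
    return w_min - kT * math.log(shifted_sum / len(works))


def population_ratio(dg_unfold_a, dg_unfold_b, kT=KT_KCAL_PER_MOL):
    """Equilibrium population ratio [A]/[B] of two folded species.

    Assumes both species unfold to the same state, so the species with the
    larger unfolding free energy is the more populated one.
    """
    return math.exp((dg_unfold_a - dg_unfold_b) / kT)
```

Because the exponential average is dominated by the smallest work values, the estimate never exceeds the mean work and becomes more reliable as the number of unfolding trajectories grows. Under the stated two-species assumption, equal ΔG_unfold values correspond to a 1:1 population ratio. 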

CONCLUSIONS 

In summary, by using a single-molecule structural probing 
method, we have identified a mixture of parallel 
and non-parallel G-quadruplexes in the same molecularly 
crowded solution at physiologically relevant DNA 
concentrations. Our method has provided direct evidence that 
the dehydration effect is the predominant factor that populates 
parallel G-quadruplexes. In contrast to DNA hairpins, 
which show a significant decrease in mechanical stability 
in crowded buffers versus diluted solutions, we have 
revealed that the rupture force of the G-quadruplex, which is 
higher than the stall force of DNA or RNA polymerase, 
remains the same. This implies that in a cellular 
environment, the G-quadruplex serves as an effective mechanical 
impediment to replication or transcription, whereas 
DNA hairpins are not likely to stop DNA or RNA 
polymerases because of their weaker mechanical stability with 
respect to the stall force of the polymerases. We anticipate 
that the single-molecule approach described here can 
serve as a quick structural identification method for 
species with known PDB structures, especially for those 
that exhibit structural polymorphism in the same buffer 
or display distinct structures under conditions not 
amenable to conventional structural probing methods. 

SUPPLEMENTARY DATA 

Supplementary Data are available at NAR Online: 
Supplementary Tables 1-3 and Supplementary Figures 1-8. 